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Abstract — Signal reconstruction in compressive sensing 
involves finding a sparse solution that satisfies a set of 
linear constraints. Several approaches to this problem 
have been considered in existing reconstruction algorithms. 
They each provide a trade-off between reconstruction 
capabilities and required computation time. In an attempt 
to push the limits for this trade-off, we consider a smoothed 
£o norm (SLO) algorithm in a noiseless setup. We argue 
that using a set of carefully chosen parameters in our 
proposed adaptive SLO algorithm may result in signifi- 
cantly better reconstruction capabiUties in terms of phase 
transition while retaining the same required computation 
time as existing SLO algorithms. A large set of simulations 
further support this claim. Simulations even reveal that 
the theoretical £i curve may be surpassed in major parts 
of the phase space. 



I. Introduction 

THE Compressive Sensing (CS) signal acquisition 
paradigm asserts that one can successfully recover 
certain signals sampled far below their Nyquist frequen- 
cies given they are sparse in some dictionary [1]. The 
Fourier dictionary for frequency sparse signals is an 
example of this. Encouraged by this assertion, the usual 
sample and then compress setup can be combined into a 
single efficient step. Signals acquired in this fashion do, 
however, have to be reconstructed which, in the noiseless 
case, entails a non-convex optimisation problem of the 
form: 



nummise ||x||o 
subject to y = Ax 



(1) 



where x G R^^^ is the reconstructed signal, A € R"^^ 
is a known measurement matrix, and y G M"^^ is the 
measured signal with n <^ N. In the CS context, n is 
the number of samples sensed while N is the number of 
samples in the original signal. We take ||x||o to denote 
the £o pseudo norm from [2], i.e. the number of non-zero 
entries in x. Solving the combinatorial problem in (1) by 
an exhaustive search is generally infeasible. 
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One feasible approach in reconstructing the signal is 
to relax the problem in (1) by substituting the £i norm 
for the io making the problem a linear program (LP) 
[3]. Another feasible approach is taken by the family 
of so-called iterative greedy algorithms. In these, the 
problem in (1) is reversed by minimising the residual 
of the energy of ||y — AxUl subject to some sparsity 
enforcing constraint. Abstractly, the greedy algorithms 
can be separated into two classes [4]: 1) Simple one 
stage algorithms which use a single greedy step in each 
iteration. Examples are Matching Pursuit (MP) [5] and 
Iterative Hard Thresholding (IHT) [6]. 2) Composite two 
stage algorithms which combine a greedy step with a re- 
finement step in each iteration. Examples are Orthogonal 
Matching Pursuit (OMP) [7] and CoSaMP [8]. The main 
advantage of the greedy algorithms over the £i approach 
is that they are computationally less complex [9] and 
require less computation time than state-of-the-art LP 
solvers [10]. 

In addition to the computation time, a measure of the 
reconsttuction quahty must be considered. Recently, the 
measure of phase transition [11] has become a standard 
way to specify reconstruction capabilities, see e.g. [4], 
[12], [13], [14]. Phase transitions evaluate the probability 
of successful reconstruction versus the indeterminacy of 
the constraints y = Ax and the true sparsity of x. In 
general, the main advantage of the £i approach over the 
greedy algorithms is that it is superior in terms of phase 
transition [4]. 

In search of a fast algorithm with a phase transition 
similar to that of the £i approach, it has been proposed to 
solve (1) by approximating the Iq norm with a continu- 
ous function [15]. The resulting smoothed £o norm (SLO) 
algorithm has a better phase transition than the greedy al- 
gorithms while requiring considerably less computation 
time than the state-of-the-art LP solvers. In this paper, 
we show that a few key parameters must be carefully 
selected and knowledge of the indeterminacy exploited 
to fully unleash the potential of SLO. We provide a set 
of empirically determined recommended parameters for 
a modified SLO algorithm that may dramatically improve 
its phase transition. Through extensive simulations, the 
claim of superiority of the recommended parameters is 
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supported. Finally, we discuss implementation strategies 
that speed up the algorithm by exploiting knowledge of 
the indeterminacy. 

The paper is organised as follows. In Section IT, 
we restate the SLO algorithm and present the proposed 
algorithm. Implementations of SLO that yield reduced 
computation time are discussed in Section III. Section IV 
describes the setup used for simulations while Section 
V provides the simulation results. A discussion of the 
results is given in Section VI. Finally, conclusions are 
stated in Section VII. 

II. Smoothed £o Norm 

SLO attempts to solve the problem in (1) by approximat- 
ing the £o norm with a continuous function. Consider the 
continuous Gaussian function with the parameter a: 

f^{x) = exp(-a;V(2 • cr^)) , x G M, a G M+ (2) 

The parameter a may be used to control the accuracy 
with which approximates the Kronecker delta. In 
mathematical terms, we have [15]: 

, . / 1, |x| < (7 



is convex and gradually increase the accuracy of the 

approximation. By careful selection of the sequence of 
(t's, (hopefully) non-convexity and thereby local minima 
are avoided. In the SLO algorithm stated below, we 
let denote the Moore-Penrose pseudo-inverse of the 
matrix A and let x o y denote the Hadamard product 
(entry wise multiplication) of the vectors x and y. 
Furthermore, we let exp(x) = [exp(xi) . . . exp(xAr)]"^ 
and max |x| = max{|xi| , . . . , |xjv|}. 

SLO algorithm: 

1 initialise: a^p = 0.5, a mm = 0.01, /x = 1, L = 3 

2 X A'I'y 

3 a <— 2 ■ max |x| 

4 while a > (Jmin do 

5 for i = 1 ... L do 

6 d ^ X o exp(-(xo x)/(2 • (7^)) 

7 X X — /X • d 

8 X X — (Ax — y) 

9 (7 <(- cr • (Tup 

For each a, the problem in (7) is solved by repeatedly 
taking an unconstrained gradient descent step in line 7 
(d is a normalised gradient of g) and projecting x back 
onto the feasible set in line 8. Substituting line 7 into 
line 8 results in the expression: 

X X — /X • (I - At A)d (8) 



Define the continuous multivariate function g as: 

N 

5(x)^5;/,(x,), xGM^xi (5) 

i=l 

Since the number of entries in x is A'^ and the function 
5 is an indicator of the number of zero-entries in x, the 
£o norm of the reconstructed vector x is approximated 
by: 

||x||o«iV-5(x) (6) 

Substituting this approximation into (1) yields the prob- 
lem: 

minimise N — q(x) 

: (7) 

subject to y = Ax 

The approach is then to solve the problem in (7) for a 
decreasing sequence of u's. The underlying thought is 
to select a a which ensures that the initial solution is 
in the subset of M^^^ over which the approximation 



where I G M is the identity matrix. The matrix 
I — A^A is a projector onto the null space of A. 
Consequently, as pointed out in [16], the unconstrained 
gradient descent step followed by projection back onto 
the feasible set is equivalent to a direct gradient descent 
step on the feasible set. The reason is that the projector 
I — A^ A restricts the gradient descent step to be on the 
feasible set. The parameter (Tup is the constant (typically 
chosen in the interval [0.5; 1[) in the geometric sequence 
of (t's while a min relates to the final value used for a 
and hence this controls the quality of the reconstruction. 
The optimal choice of a^p and is problem dependent 
while an initial a = 2 ■ max |x| as well as /i = 1 and 
L = 3 are problem independent recommendations given 
in [15]. 

A. Improving the Phase Transition 

A comprehensive proof of the convergence of the SLO 
algorithm exists for a specific set of parameters provided 
that an Asymmetric Resticted Isometry Property (ARIP) 
constraint is satisfied [17]. The authors of the proof 
conclude that though theoretically satisfactory, the ARIP 
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constraint leads to an unnecessarily pessimistic choice 
of parameter values. Motivated by this conclusion, we 
have carried out an extensive empirical analysis with the 
objective of finding the optimal parameter values. This 
analysis has shown that the standard step-size /i = 1 
and iteration number L = 3 provided in [15] result 
in a suboptimal phase transition. Instead, we suggest a 
modification to the SLO algorithm which may greatly 
improve the phase transition. 

Our initial observation is that the SLO algorithm pre- 
sented here is outperformed by the greedy IHT algorithm 
described in [4] in terms of phase transition for indeter- 
minacy 5 = n/N < 0.8. By introducing an update of the 
L parameter, the phase transition of the SLO algorithm 
can be improved to lie between that of the greedy IHT 
algorithm and that of the l\ approach. Specifically, we 
obtained this improvement by multiplying L with a 
factor > 1.5 after each update of a. By carefully 
selecting sequences of //'s and a's, the phase transition 
of the SLO algorithm can be even further improved to 
consistently lie on or above that of the li approach. In 
brief, we chose the step-size in the order of 10~^ for 
the first few cr's and in the order of 1 for the last u's. 
Also, we chose the initial a based on knowledge of the 
indeterminacy 5 to improve the phase transition across 
all 6. 

Based on our experimental results, we propose the 
following strategy. We choose a step-size of jjL = 
0.001 for the first three a's and a step-size of /x = 
1.4 for the last d's. We use a sequence of yu = 
[0.001, 0.001, 0.001, 0.05, 0.06, 1.4, . . . , 1.4]^ where the 
number of entries equals the number of cr's used. Fur- 
thermore, an inversely proportional relation between 5 
and the initial value of a yielded the most promis- 
ing phase transition. Specifically, we choose an initial 
a = 1/(2.75 • 5) ■ max|x| and combine this choice 
with a (Tup = 0.7. Finally, a gradually increasing L 
for decreasing a still provides an improvement for the 
updated parameter choices. Here, we choose a geometric 
sequence starting with L = 2 and increasing by a factor 
of Lap = 1.9 for each update of a. 

With an increased value of a^p and gradually in- 
creasing values of L, the computation time is bound to 
increase. This effect can, however, be counteracted by 
introducing a stopping criterion in the inner loop of the 
SLO algorithm. Therefore, we choose to terminate the 
inner loop when the relative change ||x — Xprevlb falls 
below a ■ e where e = 0.01. Generally, this measure 
has proved to be a good indicator of convergence and 
significantly reduced the average number of iterations 
taken in the inner loop. We now propose the smoothed 
£o norm algorithm with modified step-size (SLO MSS) 



which incorporates all of the above findings. 
SLO MSS algorithm: 



1 initialise: a, 



0.7, CJmm = 0.01, L = 2.0, 
Lup = 1.9, e = 10-2, k = 1 



up 



2 X ^ A^y 

3 [0.001, 0.001, 0.001, 0.05, 0.06, 1.4, ...,1.4]'^ 

4 cr <(- 1/(2.75 • (5) • max jxj 

5 while a > (Jmin do 

6 Xprev ^ 

7 i = 

8 while jjx — Xprev||2 > (T ■ e and i < L do 



X, 



prev 



X 



9 

10 d X o exp(— (x o x)/(2 • cr^)) 

11 X ^ X - //fc • (I - A'l'Aj'^d 

12 i <— i + 1 

13 (T ■<— cr • (Tup; L ^ L ■ Lup; k k + \; 



III. Implementations of SLO 

Through experiments, we have found the most time 
consuming parts of the SLO MSS algorithm to be the 
computation of A^ and the matrix-vector products in 
the gradient descent step. We now provide a strategy for 
exploiting the known indeterminacy S = n/N to reduce 
the number of required computations. 

Consider the full QR decomposition of A^ [18]: 



[Qi Q2 



(9) 



where Qi G M^^" and Q2 G RNx{N-n) f^^^ ^^^^^ 
for the row space and null space of A, respectively 



and R G 



is upper triangular. Note how the 



dimensions of Qi, Q2, and R change with S. From the 
QR decomposition, the pseudo-inverse At can be found: 



At = QiR 



(10) 



Three equivalent expressions for the projector that 
projects onto the null space of A are then: 



I-AtA = I-QiQ?^ = Q2Q^ 



(11) 



We now propose the following scheme for an implemen- 
tation of the SLO MSS algorithm. When 5 < 0.5, use 
the projector A^ and split the update in line II in an 
infeasible gradient descent step followed by a projection 
onto the feasible set: 
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X <(- X - /X • d (12) 
X ^ X- A^(Ax-y) (13) 

When S > 0.5, use the projector Q2QJ and split the 
update in line 1 1 in two steps: 

V ^ Qjd (14) 
X X - /Xjfc • Q2V (15) 

To appreciate the split procedure, note that fewer arith- 
metic operations are needed in computing the matrix- 
vector product (Q2Qj)d this way when: 

Flops{(Q2Qj)d} > 

Flops{Qjd} + Flops{Q2v} (16) 

which implies that: 

N-{2N -1)> 

{N-n)-{2N-l) + N-{2-{N -n)- 1) (17) 

or using an approximation: 

n>N/2 o 5^1/2 (18) 

where Flops{x} denotes the number of floating point 
operations in the computation of the expression x. The 
split procedure does not require the explicit forming of 
At since the initial least squares (least norm) solution 
X = A^y in Hne 2 may be efficiently computed by 
solving the following system of equations by substitution 
(since R is triangular): 

y = R^u (19) 



IV. Simulation Framework 

We evaluate the reconstruction capabilities of an algo- 
rithm by use of the phase transition measure [11] which 
provides the following framework. Let k denote the num- 
ber of non-zero entries in the true signal x. Define the 
measure of indeterminacy 6 = n/N and the generalised 
measure of sparsity (density) p = k/n. Given a success 
criterion, the success rate is then evaluated on the phase 
space ((5, p) G [0, 1]^. In general, reconstruction is easier 
for larger S and smaller p and becomes increasingly diffi- 
cult when decreasing S and increasing p. Somewhere in- 
between, the phase transition curve separates the phase 
space into a phase where reconstruction is likely and a 
phase where it is unlikely. This phase transition curve is 
continuous in S for fixed N. Obviously, it is desirable to 
have a phase transition curve which is as close to p = 1 
as possible. 

Different suites of problems, i.e. different combina- 
tions of ensembles of A and x generally result in dif- 
ferent phase transitions. Choosing A from the Uniform 
Spherical Ensemble (USE) and the non-zero entries in 
x from the Rademacher distributed generally yields the 
most difficult problem suite in terms of obtaining good 
phase transitions [4]. In the simulations, we consider 
this problem suite along with a the problem suite where 
A is chosen from the USE and the non-zero entries 
in X are chosen from the zero mean, unit variance 
Gaussian ensemble. In [19], it is shown that the prob- 
ability of reconstruction versus p for fixed N and 5 
can be modelled accurately by logistic regression for 
a specific set of algorithms. Logistic regression is used 
more generally in [4] to determine the location of the 
phase transition curve. We adopt the logistic regression 
approach to estimate the location of the phase transition 
curve and fix = 800 as proposed in [4]. We then 
attempt reconstruction on a uniform grid {S,p) in the 
phase space specified by: 



followed by computing: 

X = Qiu (20) 

Avoiding the explicit forming of A^ eliminates the 
need for computing the inverse of which becomes 
progressively more computationally expensive as 5 — )• 1. 
On the other hand, only the reduced QR decomposition 
a""" = QiR is needed when A^ is explicitly formed. 
The reduced QR decomposition becomes progressively 
less computationally expensive than the full QR decom- 
position as (5 ^ 0. These two observations motivate the 
change of method at S = 1/2. 



5 G {0.025,0.05, ...,1.00} (21) 
pG {0.01, 0.02,..., 1.00} (22) 

For each point in the grid, we do 10 Monte Carlo 
simulations where each simulation features a new draw 
of A and x. From [4] we adopt that an attempted 
reconstruction is considered successful when: 

^^W^ < 10"' (23) 

where x and x are the reconstructed and true signal, 
respectively. If the criterion is not met, the attempted 
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reconstruction is considered unsuccessful, i.e., the at- 
tempted reconstruction cannot be considered indetermi- 
nate. 

To evaluate the required computation time for the 
different algorithms, we measure the absolute time spent 
on reconstruction when it succeeds. The problem suite 
formed by choosing A from the USE combined with 
Rademacher distributed non-zero entries in x is used in 
this test. A uniform grid ((5, p) in the phase space is 
formed by: 

(5 e {0.1, 0.2,..., 1.0} (24) 
{0.1, 0.2,..., 1.0} (25) 

An algorithm is tested on all points in the grid that are 
at least 0.025 below its empirically determined phase 
transition (measured on the p-axis). The time spent on 
reconstruction for the successful part of 10 Monte Carlo 
simulations in each point is then averaged. Considered 
problem sizes are: 

N G {800, 1600, 3200, 6400, 12800} (26) 

The simulations have been conducted on an Intel Core 17 
970 6-core 3.2 GHz based PC with 24GiB DDR3 RAM. 
The OS used is 64-bit Ubuntu 12.04 LTS Linux and the 
Enthought Python Distribution (EPD) 7.2-2 (64-bit). All 
simulations are carried out in double precision. 

To validate the results obtained from our simulation 
framework, we have simulated the Iterative Hard Thresh- 
olding algorithm presented in [4]. The phase transition 
obtained in our simulation framework has then been 
compared with the phase transition obtained in the sim- 
ulation framework of [4]. Due to the non-deterministic 
nature of the Monte Carlo simulations used in both 
simulation frameworks, the two phase transitions will 
inevitably differ slightly. However, we have observed that 
they are almost identical and therefore concluded, that 
our simulation framework works as intended. 

V. Experimental Results 

Four algorithms have been simulated: 1) SLO STD which 
is the SLO algorithm presented in Section II. 2) SLO MIN 
which is the same algorithm except it is modified such 
that L is multiphed by 1.7 each time a is decreased. 3) 
SLO MSS which is the algorithm presented in Section 
II-A. 4) IHT which is the Iterative Hard Thresholding 
algorithm described in [4]. In the case of the SLO MSS 
algorithm, two implementations have been simulated: 
The SLO MSSI implementation based on (12) and (13) 



and the SLO MSSII implementation based on (14) and 
(15). 

The experimental results are presented in Figure 1, 
2, 3, and 4. Figure 1 shows the phase transitions for 
Rademacher distributed non-zero entries in x while 
Figure 2 shows the phase transitions for zero mean, 
unit variance Gaussian non-zero entries in x. In both 
figures, the theoretical £i curve from [20] is included 
for reference. Figure 3 shows the measured average 
computation times versus indeterminacy 6 = n/N and 
Figure 4 shows the measured average computation times 
versus problem size N. Note the abrupt ending of the 
SLO STD curve in Figure 3 which is due to failure of 
reconstruction in the tested grid for 5 < 1/2. Also, note 
that the measured computation times of the SLO MIN 
implementation have been divided by 20. 

In summary, SLO MSS shows the best phase transition 
among the tested algorithms for both Rademacher and 
Gaussian non-zero entries in x. In large portions of 
the phase space it even surpasses the theoretical £i 
curve. The only exception is in the Gaussian case for 
6 < 0.2 where IHT shows better phase transition. 
Regarding computation time, IHT is faster than the SLO 
approaches among which SLO MIN is consistently more 
than 20 times slower than SLO MSS. Furthermore, IHT 
scales slightly better with problem size N than the SLO 
approaches. 

VI. Discussion 

For Rademacher non-zero entries in x. Figure 1 reveals 
that SLO STD, SLO MIN, and IHT are by far outper- 
formed by SLO MSS in terms of phase transistion. Even 
the theoretical £i curve is surpassed by SLO MSS at 
around S > 0.3. For S < 0.3, SLO MSS shows the same 
phase transition as the theoretical £i curve. The curve for 
SLO MIN is a clear example of the improvement in phase 
transition obtainable using more iterations in the inner 
loop of the SLO algorithm. However, SLO MSS further 
improves on this, especially for 5 < 0.1 and S > 0.3. 

The results in Figure 3 settle that SLO does indeed 
require more computation time than IHT. IHT is around 
two to four times faster (depending on S) than the fastest 
SLO implementation for a problem size of = 3200. 
The important thing to note though, is that SLO provides 
a trade-off between phase transition and computation 
time. The price paid in computation time for using a lot 
of iterations to get better phase transition is clear from 
the SLO MIN curve. This is, however, not the case for 
SLO MSS, which requires less than or about the same 
computation time as SLO STD depending on S. Thus, a 
much better phase transition is obtained using largely the 
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Fig. 1: Phase transitions for the problem suite: A from the USE, x 
with non-zero Rademacher entries. The theoretical £i curve is from 
[20]. 
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Fig. 2: Phase transitions for the problem suite: A from the USE, 
X with non-zero zero mean, unit variance Gaussian entries. The 
theoretical £i curve is from [20]. 




same computation time in going from SLO STD to SLO 
MSS. To obtain such a result, it is necessary to switch 
from SLO MSSI to SLO MSSII at around 5 = 1/2. 

An assessment of the scaling of average computation 
time with problem size N reveals that all three SLO 
algorithms seem to scale in an equivalent way. IHT 
scales better than SLO and hence requires relatively less 
computation time as the problem size N increases. The 
scaling depicted in Figure 4 is for (5 = 1/2 which 
provides a rough average computation time across all 
values of S as can be seen from Figure 3. 

The parameters for SLO IVISS stated in Section II-A 
are (locally) optimal in terms of phase transition for 
Rademacher non-zero entries in x. Gaussian non-zero 
entries in x are known to be in favour of greedy 



algorithms [14], which is also the case for IHT in our 
simulations, especially for 6 < 0.2 where the IHT curve 
surpasses the theoretical ii curve. For 6 > 0.2, SLO 
MIN and SLO MSS demonstrate the best phase transition 
among the shown algorithms. SLO MIN and SLO MSS 
phase transitions are about the same, though. Comparing 
our results for SLO MSS in Figure 2 with the ones given 
for SLO in Figure 6 in [14] shows about the same phase 
transition. The slightly better phase transition for 5 < 0.2 
in [14] may be due to the SLO MSS parameters not 
necessarily being optimal for Gaussian non-zero entries 
in X. 

Although the above simulations are quite encouraging, 
they are based on an empirically tuned algorithm. Thus, 
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to reach a final verdict of the success of SLO MSS, the 
validity of the simulation results must be exhaustively 
studied for a broader set of problem suites. Alternatively, 
more sound mathematical proofs must be presented. 

VII. Conclusions 

We have proposed a new compressive sensing recon- 
struction algorithm named SLO MSS based on the 
smoothed £o norm. It turns out that SLO phase transi- 
tions heavily depend on parameter selection. SLO MSS 
attempts to improve on phase transition by exploiting 
the known indeterminacy 6 = n/N combined with 
carefully selected parameters. A trade-off between phase 
transition and computation time is provided by SLO. 
Improved phase transition has been measured for SLO 
MSS compared to standard SLO while maintaining the 
same computation time. 
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